Cloud Computing for Linguists

Authors

  • Dorothee Beermann
  • Pavel Mihaylov
Abstract

The system presented is a web application designed to support linguistic research through data collection and online publishing. It is a service aimed mainly at linguists and language experts who describe less-documented and less-resourced languages. When the central concern is in-depth linguistic analysis, maintaining and administering software can be a burden. Cloud computing offers an alternative. Linguistic web applications are at present used mainly for archiving; we extend them to allow the creation, search and storage of interlinear annotated texts. By combining a conceptually appealing online glosser with an SQL database and a wiki, we make the online publication of linguistic data an easy task, also for researchers who are not computationally oriented.

1 General description of TypeCraft

TypeCraft (TC for short) is a multilingual online database of linguistically annotated natural language texts, embedded in a collaboration and information tool. It is an online service that allows users (projects as well as individuals) to create, store and retrieve structured data of the kind mainly used in natural language research. In a system featuring graded access, users may create their own domain, invite others, and share their data with the public. The kernel of TypeCraft is morphological word-level annotation in a relational database setting, wrapped in a wiki that serves as a tool for communication and for gathering and sharing information. TypeCraft allows raw text to be imported for storage and annotation, and annotated data to be exported to MS Word, OpenOffice.org, LaTeX and XML. The online system is complemented by an offline client, a Java application offering the same functionality as the online version. This allows a seamless exchange of data between the server and the user’s own computer.

2 Online system internals

The online system is supported by a central server running the following modules: the TypeCraft server proper, an SQL database, Apache, and MediaWiki. The client side consists of the TypeCraft editor interface and a wiki environment (content produced by MediaWiki on the server). Users perceive the wiki and the editor interface as a single TypeCraft web application. The TypeCraft server proper is a Java application running inside a Java application server. TypeCraft uses a PostgreSQL database for data storage. The data mapping between Java objects and database tables is managed by Hibernate, so the system is not bound to any specific SQL database. TypeCraft data can be divided into two distinct groups: common data, shared between all annotated tokens and users, such as the word and sentence level tag sets and an ISO 639-3 specification, and individual data, by which we mean specific texts, phrases, words and morphemes. Individual data references common data types. For example, all users of the system who use the part-of-speech tag N share a reference to a single common tag N.

3 Digital linguistic data

It is well known that generating linguistic annotation of any kind is time-consuming, largely independent of the form of the primary data and of the tools chosen to process it. Equally well known are the problems connected with the generation and storage of linguistic data.
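
The split between common and individual data described in Section 2 can be illustrated with a small sketch. It is written in Python for brevity and is not the TypeCraft implementation, which the text describes as Java objects mapped to PostgreSQL tables by Hibernate. The class names `Tag`, `TagSet` and `Word`, their fields, and the example words and user names are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Tag:
    """Common data: one entry of a shared tag set (hypothetical model)."""
    label: str
    level: str  # "word" or "sentence"


class TagSet:
    """Registry that hands out exactly one Tag object per (level, label)."""

    def __init__(self) -> None:
        self._tags: Dict[Tuple[str, str], Tag] = {}

    def get(self, label: str, level: str = "word") -> Tag:
        key = (level, label)
        if key not in self._tags:
            # Create the common tag once; later calls reuse the same object.
            self._tags[key] = Tag(label=label, level=level)
        return self._tags[key]


@dataclass
class Word:
    """Individual data: a token owned by one user, referencing a common tag."""
    form: str
    owner: str
    pos: Tag
    morphemes: List[str] = field(default_factory=list)


tags = TagSet()

# Two hypothetical users annotate different words with the same tag N.
w1 = Word(form="hund", owner="user_a", pos=tags.get("N"), morphemes=["hund"])
w2 = Word(form="katter", owner="user_b", pos=tags.get("N"), morphemes=["katt", "er"])

# Both words point at one shared Tag object rather than holding private copies.
shares_tag = w1.pos is w2.pos
```

In a relational setting the same idea would normally be expressed as a foreign key from a word table to a tag table, so that a tag is stored once and referenced by every annotation that uses it.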

Similar resources

A Literature Review on Cloud Computing Security Issues

The use of Cloud Computing has increased rapidly in many organizations. Cloud Computing provides many benefits in terms of low cost and accessibility of data. In addition, Cloud Computing was predicted to transform the computing world from using local applications and storage into centralized services provided by organizations.[10] Ensuring the security of Cloud Computing is a major factor in the Clo...

Impacts and Challenges of Cloud Computing for Small and Medium Scale Businesses in Nigeria

Cloud computing technology is providing businesses, be they micro, small, medium or large-scale enterprises, with a level playing field. Small and medium enterprises (SMEs) that have adopted the cloud are taking their businesses to greater heights with the competitive edge that cloud computing offers. The limitations faced by SMEs in procuring and maintaining IT infrastructures has be...

An Effective Task Scheduling Framework for Cloud Computing using NSGA-II

Cloud computing is a model for convenient, on-demand user access to configurable computing resources such as networks, servers, storage, applications, and services, with minimal resource management and service provider interaction. Task scheduling is regarded as a fundamental issue in cloud computing which aims at distributing the load on the different resources of a distribu...

Top Benefits and Hindrances to Cloud Computing Adoption in Saudi Arabia: A Brief Study

Cloud computing is an emerging concept in information technology that influences many companies in many countries. This research evaluates cloud computing adoption in Saudi Arabia, in particular its benefits and hindrances for small and medium-sized enterprises (SMEs). The qualitative research approach is performed by interviews with the management of a variety of SMEs active in the infor...

Task Scheduling Algorithm Using Covariance Matrix Adaptation Evolution Strategy (CMA-ES) in Cloud Computing

Cloud computing is a computational model that provides resources for users' requests on demand. The need to plan the scheduling of users' jobs has emerged as an important challenge in the field of cloud computing. This is mainly due to several reasons, including ever-increasing advancements of information technology and an increase of applications and...

Publication date: 2010